707 research outputs found

    Frameless Representation and Manipulation of Image Data

    Get PDF
    Most image sensors mimic film, integrating light during an exposure interval and then reading the latent image as a complete frame. In contrast, frameless image capture attempts to construct a continuous waveform for each sensel describing how the Ev (exposure value required at each pixel) changes over time. This allows great flexibility in computationally extracting frames after exposure. An overview of how this could be accomplished was presented at EI2014, with an emphasis on frameless sensor technology. In contrast, the current work centers on deriving frameless data from sequences of conventionally captured frames

    Extending Static Synchronization Beyond SIMD and VLIW

    Get PDF
    A key advantage of SIMD (Single Instruction stream, Multiple Data stream) architectures is that synchronization is effected statically at compile-time, hence the execution-time cost of synchronization between “processes” is essentially zero. VLIW (Very Long Instruction Word) machines are successful in large part because they preserve this property while providing more flexibility in terms of what kinds of operations can be parallelized. In this paper, we propose a new kind of architecture —- the “static barrier MIMD” or SBM — which can be viewed as a further generalization of the parallel execution abilities of static synchronization machines. Barrier MIMDs are asynchronous Multiple Instruction stream Multiple Data stream architectures capable of parallel execution of loops, subprogram calls, and variable execution- time instructions; however, little or no run-time synchronization is needed. When a group of processors within a barrier MIMD has just encountered a barrier, any conceptual synchronizations between the processors are statically accomplished with zero cost — as in a SIMD or VLIW and using similar compiler technology. Unlike these machines, however, as execution continues the relative timing of processors may become less precisely knowable as a static, compile-time, quantity. Where this imprecision becomes too large, the compiler simply inserts a synchronization barrier to insure that timing imprecision at that point is zero, and again employs purely static, implicit, synchronization. Both the architecture and the supporting compiler technology are discussed in detail

    Data Layout Optimization and Code Transformation for Paged Memory Systems

    Get PDF
    Supercomputers need not only to have fast functional units, but also to have rapid access to massive quantities of data. Virtual memory paging and physically distributed memory systems both attempt to provide this large data space, but performance of a computer system using either memory organization is highly dependent on the page reference pattern and the number of pages available locally. Despite this, surprisingly little work has been done toward using the compiler to optimize memory system performance. In this paper, we introduce compiler techniques which use a combination of data layout and code transformation to improve paging performance for compiled programs. These same techniques can also be applied manually to improve performance using existing compilers

    Automatic Parallelization of Database Queries

    Get PDF
    Although automatic parallelization of conventional language programs is now widely accepted, relatively little emphasis has been placed on automatic parallelization of database query programs (sometimes referred to as “multiple queries” ). In this paper, we discuss the unique problems associated with automatic parallelization of database programs. From this discussion, we derive a complete approach to automatic parallelization of database programs. Beside integrating a number of existing techniques, our approach relies heavily on several new concepts, including the concepts of “algorithm-level” analysis and hybrid static/dynamic scheduling

    Compiler-Driven Cache Policy (Known Reference String)

    Get PDF
    Increasing cache hit-ratios has proved to be instrumental in improving performance of cache-based computers. This is particularly true for computers which have a high cache-miss/cache-hit memory reference delay ratio. Although software policies are often used for main vs. secondary memory caching , the speed required for an implementation of a CPU vs. main memory cache policy has prompted only investigation of policies which can be implemented directly in hardware. Based on compile-time analysis, it is possible to predict program behavior, thereby increasing the hit-ratio beyond the capability of pure run-time (hardware) techniques. In this report, compiler-driven techniques for this kind of cache policy are described. The SCP Model (software cache policy model) provides an optimal cache prefetch and placement/replacement policy when given an arbitrary memory reference string. In addition to suggesting a simplified cache hardware model, the SCP Model can be applied to various cache organizations such as direct mapping, set associative, and full associative. Analytic results demonstrate significant improvements in cache performance. The current work discusses an optimal cache policy which applies where the string of references is known at compile time. However, this constraint can be relaxed to encompass reference strings which are known only statistically, i.e., reference strings in which data aliases make the target of some references ambiguous. Companion reports, currently in preparation, detail the extension of the SCP Model to incorporate aliases, code incorporating loops, and conditional branches

    Algorithm Choice For Multiple-Query Evaluation

    Get PDF
    Traditional query optimization concentrates on the optimization of the execution of each individual query. More recently, it has been observed that by considering a sequence of multiple queries some additional high-level optimizations can be performed. Once these optimizations have been performed, each operation is translated into executable code. The fundamental insight in this paper is that significant improvements can be gained by careful choice of the algorithm to be used for each operation. This choice is not merely based on efficiency of algorithms for individual operations, but rather on the efficiency of the algorithm choices for the entire multiple-query evaluation. An efficient procedure for automatically optimizing these algorithm choices is given

    Hardware Barrier Synchronization: Static Barrier MIMD (SBM)

    Get PDF
    In this paper, we give the design, and performance analysis, of a new, highly efficient, synchronization mechanism called “Static Barrier MIMD” or “SBM.” Unlike traditional barrier synchronization, the proposed barriers are designed to facilitate the use of static (compile-time) code scheduling for eliminating some synchronizations. For this reason, our barrier hardware is more general than most hardware barrier mechanisms, allowing any subset of the processors to participate in each barrier. Since code scheduling typically operates on fine-grain parallelism, it is also vital that barriers be able to execute in a small number of clock ticks. The SBM is actually only one of two new classes of barrier machines proposed to facilitate static code scheduling; the other architecture is the “Dynamic Barrier MIMD,” or “DBM,” which is described in a companion paper1. The DBM differs from the SBM in that the DBM employs more complex hardware to make the system less dependent on the precision of the static analysis and code scheduling; for example, an SBM cannot efficiently manage simultaneous execution of independent parallel programs, whereas a DBM can

    Loop Coalescing and Scheduling for Barrier MIMD Architectures

    Get PDF
    Barrier MIMDs are asynchronous Multiple Instruction stream Multiple Data stream architectures capable of parallel execution of variable execution time instructions and arbitrary control flow (e.g., while loops and calls); however, they differ from conventional MlMDs in that the need for run-time synchronization is significantly reduced. This work considers the problem of scheduling nested loop structures on a barrier MIMD. The basic approach employs loop coalescing, a technique for transforming a multiply-nested loop into a single loop. Loop coalescing is extended to nested triangular loops, in which inner loop bounds are functions of outer loop indices. Also, a more efficient scheme to generate the original loop indices from the coalesced index is proposed for the case of constant loop bounds. These results are general, and can be applied to extend previous work using loop coalescing techniques. We concentrate on using loop coalescing for scheduling barrier MIMDs, and show how previous work in loop transformations [Wol89], [Pol88] and linear scheduling theory [ShF88], rShO901 cart be applied to this problem

    System and Method for 3D Imaging using Structured Light Illumination

    Get PDF
    A biometrics system captures and processes a handprint image using a structured light illumination to create a 2D representation equivalent of a rolled inked handprint. The biometrics system includes an enclosure with a scan volume for placement of the hand. A reference plane with a backdrop pattern forms one side of the scan volume. The backdrop pattern is preferably a random noise pattern and the coordinates of the backdrop pattern are predetermined at system provisioning. The biometrics system further includes at least one projection unit for projecting a structured light pattern onto a hand positioned in the scan volume on or in front of the backdrop pattern and at least two cameras for capturing a plurality of images of the hand, wherein each of the plurality of images includes at least a portion of the hand and the backdrop pattern. A processing unit calculates 3D coordinates of the hand from the plurality of images using the predetermined coordinates of the backdrop pattern to align the plurality of images and mapping the 3D coordinates to a 2D flat surface to create a 2D representation equivalent of a rolled inked handprint. The processing unit can also adjust calibration parameters for each hand scan from calculating coordinates of the portion of backdrop pattern in the at least one image and comparing with the predetermined coordinates of the backdrop pattern

    System and Method for 3D Imaging using Structured Light Illumination

    Get PDF
    A biometrics system captures and processes a handprint image using a structured light illumination to create a 2D representation equivalent of a rolled inked handprint. A processing unit calculates 3D coordinates of the hand from the plurality of images and maps the 3D coordinates to a 2D flat surface to create a 2D representation equivalent of a rolled inked handprint
    • …
    corecore